Mutant Recombinases

FIELD OF THE INVENTION

The present invention relates to a method of identifying hyperactive mutant 
recombinases and such mutant recombinases, as well as hybrid mutant recombinases. 
The present invention also relates to vectors comprising nucleic acid encoding said 
recombinases, as well as cells, especially eukaryotic cells capable of expressing said 
recombinases and carrying out site-specific recombination in the cell. Uses of said
recombinases in biotechnology and/or gene therapy/transgenic applications are also
provided, as well as novel recombination systems in a cell such as a eukaryotic cell,
especially a mammalian cell.

INTRODUCTION 

Site-specific recombination is extensively used for genetic manipulations in
vivo, and is central to many proposed approaches to gene therapy (Kilby et al., 1993;
Nagy, 2000). It is generally used to site-specifically introduce or excise a DNA
fragment (for example an engineered cassette) into or from the genomic DNA, in a
controlled way (for example, at a specific stage of development, or following
deliberate induction of the recombinase). Nearly all current applications of
site-specific recombination in eukaryotes use the loxP-Cre system from bacteriophage P1
(see review by Nagy, 2000). Cre is a good recombinase for these purposes because of
its short DNA recombination site (loxP; 34 bp), its stability in vivo and the robustness
of its activity even in chromatin-associated DNA. Use of these site-specific
recombination systems in eukaryotes depends on the introduction of target DNA
containing the appropriate DNA recognition/recombination sites into the organism.
However, there is a great deal of current interest in modifying site-specific
recombinases so as to recognise natural sequences in eukaryote (e.g. human) genomes
(Santoro and Schultz, 2002; Sclimenti et al., 2001; Buchholz and Stewart, 2001).

The bacterial transposon Tn3, a member of the large 'serine recombinase'
family, encodes a site-specific recombination system comprising a 114 bp DNA site,
res, and a serine recombinase, resolvase. res contains three binding sites for resolvase
dimers. Recombination takes place within a 'synapse', consisting of the intertwined
pair of resolvase-bound res sites that are to recombine. Strand exchange occurs at the
centre of the two binding site Is, and is catalysed by the resolvase dimers bound at site
I. However, wild-type resolvase is inactive on a substrate containing just two site Is;
the presence of the 'accessory' resolvase-binding sites, II and III, hereinafter referred
to as acc (Blake, 1995), in each res is essential for normal activity.

The acc sequences and the resolvase subunits bound to them play an essential 
part in the imposition of these selectivities (reviewed by Grindley, 2002). Regulatory 
DNA sequences like acc are prevalent in natural site-specific recombination systems. 
They may be adjacent to or distant from the site of crossing over, and may bind 
subunits of the recombinase (as acc does) and/or other proteins. Their functions are to 
ensure that recombination occurs only at the right times and places (reviewed by 
Nash, 1996). 

The 20 kDa resolvases of the transposons Tn3 and γδ are very similar (147 of
185 residues are identical). X-ray crystallography has yielded high resolution
structures of γδ resolvase, both on its own and in a complex with site I of res
(Sanderson et al., 1990; Rice and Steitz, 1994; Yang and Steitz, 1995). However, the
structure of the synapse is still not well-defined, despite much analysis. To build a
functional synapse, at least three types of resolvase-resolvase interaction are thought
to be required, two of which are represented in crystal structures. The 1,2 interaction
(Hughes et al., 1990) forms the resolvase dimer that is present in solution and in
complexes of resolvase bound to parts of res; it is found in all of the three published
crystal structures. The role of the 2,3' interaction between resolvase dimers, seen in
crystals of the γδ resolvase protein but not the DNA-resolvase co-crystal, is more
elusive. Mutation of single residues at this interface eliminates recombination
activity. The mutants are defective in cooperative binding to res, and in synapsis
(Hughes et al., 1990; Murley and Grindley, 1998). The 2,3' interaction is an essential
feature of several proposed structures for the synapse (Rice and Steitz, 1994;
Grindley, 1994; Yang and Steitz, 1995; Murley and Grindley, 1998; Sarkis et al.,
2001; Rowland et al., 2002). A third interaction, not observed in any of the crystal
structures, has been proposed to be required in order to bring two 1,2 dimers together
in an arrangement suitable for catalysis of strand exchange at site I (Rice and Steitz,
1994; Yang and Steitz, 1995). This third interaction may also have other
"non-catalytic" roles in synapsis. In the recent synapse model of Sarkis et al. (2001), the
protein core comprises three "DNA-out" tetramers, interacting with each other at 2,3'
surfaces.

In published work (Schwikardi and Droge, 2000), γδ resolvase and a mutant of
it have been shown to be active in mammalian cells, on full res sites and (very
inefficiently) on the 28 bp site I of res. Another related recombinase, Gin, has been
shown to be active in plant protoplasts (Maeser and Kahmann, 1991). Moreover,
earlier work by the present inventors describes mutants of Tn3 resolvase that act on
28 bp site I (inefficiently) in E. coli or in vitro (Arnold et al., 1999). Some more
recent work has been disclosed in Sarkis et al. (2001). Nevertheless, there has been no
disclosure of potential or actual use of these mutants in other organisms, e.g.
mammalian cells, or for genetic engineering purposes, and moreover it is desirable to
develop better mutant recombinases than those hitherto described.

While the concept of directing an enzyme to a chosen DNA sequence by
attaching a DNA-binding domain from an unrelated protein may not be new (there are
several examples of enzymes that have been fused to the Zif268 DNA-binding
domain or derivatives in order to direct activity to a new site; see, for example, Bibikova
et al., 2002, and references cited therein), this has not been done until now for any
site-specific recombinase, because it is not obvious how to do it. However, sequence
recognition by site-specific recombinases has been altered by mutagenesis, or by
swapping domains between related proteins. The tyrosine recombinases Cre and FLP,
for example, have been extensively mutated to try to achieve new sequence
recognition, with partial success (Buchholz and Stewart, 2001; Santoro and Schultz,
2002; and references cited therein). Cre/Flp hybrid proteins with unusual properties
(but no recombination activity) have been created (Shaikh and Sadowski, 2000), and
phage lambda integrase has also been 'spliced' with a closely related protein in order
to alter sequence recognition (Nunes-Duby et al., 1994). However, for all the tyrosine
recombinases, it is not obvious how the DNA-binding and catalytic functions of the
protein could be separated, so changing recognition completely by attaching a
heterologous DNA-binding domain or similar is currently implausible.

For the serine recombinases, the crystal structure of γδ resolvase bound to site
I DNA (Yang and Steitz, 1995) shows that the 'DNA-binding' and 'catalytic'
domains are folded separately and do not make an intimate interaction. It was
previously known that the C-terminal 45 residues of Tn3/γδ resolvase (141-185) are
largely responsible for specific DNA recognition, and that residues 1-140 contain the
known catalytic functions. However, there is no suggestion in the art that catalysis
could be achieved without the natural C-terminal domain or some other similar
domain. It was unlikely that specific catalytic residues were in the C-terminal domain,
because several hybrid recombinases were active. In these hybrids, the C-terminal
domain was exchanged for that of another quite closely related serine recombinase.
The junction was placed so as to conserve exactly the positions of residues that were
homologous in the two parents. Examples of hybrids were between parts of Tn3 and
Tn21 resolvases, or Tn3 and Tn5J2 resolvases, or Tn3 and γδ resolvases, or Gin and
ISXc5 resolvase (Avila et al., 1990; Schneider et al., 2000). Nevertheless, all of these
hybrids were active only on long DNA sequences (full res sites), not on a short
sequence like site I, and only small changes in sequence recognition were
achieved.

It is an object of the present invention to obviate and/or mitigate at least one of 
the aforementioned disadvantages. 

It is another object of the present invention to provide novel mutant 
recombinases which may find use in gene therapy and/or other biotechnological 
applications and/or develop uses of mutant recombinases not hitherto suggested. 

SUMMARY OF THE INVENTION 

In a first aspect the present invention provides a method for identifying a 
hyperactive mutant serine recombinase capable of catalysing site-specific DNA 
recombination when bound to a recognition site comprising fewer nucleotides than 
necessary for achieving recombination with a corresponding wild-type serine 
recombinase, comprising the steps of:

mutating said wild-type serine recombinase such that the mutant recombinase
comprises one or more mutations, in a catalytic domain of the recombinase, with
respect to the wild-type serine recombinase; and

detecting whether or not said mutant serine recombinase is capable of
catalysing DNA recombination when bound to said recognition site comprising fewer
nucleotides than necessary for achieving recombination with the corresponding
wild-type serine recombinase.

The term hyperactive mutant recombinase is used to indicate that the mutant is
capable of recombinase activity at smaller recognition sites than required by a
wild-type recombinase.

Generally speaking, recombination is carried out such that two such
recognition sites are brought into close proximity for site-specific recombination to
occur. Site-specific recombination is understood to relate to genetic recombination
occurring between two particular, but not necessarily homologous, short DNA
sequences, as in the integration or excision of phage DNA from a bacterial
chromosome or in transposition. It is likely that more than one detection step from
wild-type to preferred mutant may be required. That is, it may be necessary first to
select mutants with a substrate comprising one wild-type recognition site and one site
of reduced size, and then to further mutagenise suitable mutants to obtain a preferred
hyperactive mutant which shows recombination activity at two sites of reduced size.

Conveniently the sites of reduced size comprise less than 50 nucleotides,
typically less than 30 nucleotides.

The present invention describes in one embodiment recombinases derived
from Tn3 resolvase, by combining mutations as indicated below, that efficiently
recombine two sequences corresponding to the 28 bp binding site I of Tn3 res (or
minor variants thereof). The D102Y E124Q mutant described in Arnold et al. (1999)
has weak activity on a site I x site I substrate, in E. coli or in vitro, insufficient to be
useful, and is therefore not encompassed within the scope of the present invention.
Much more active mutants were created by combining mutations in the region close
to D102. Additionally, all the most efficient versions are mutant at D102. The present
inventors have tested all possible residues at position 102; the effects of the single
mutants are, in decreasing order of hyperactivity, Y, I > F, T, V, W > A > all others.
Mutation of G101 has also been observed to cause a big effect, specifically mutation
to serine (G101S). Thus, mutants according to the present invention preferably comprise
mutations at D102 and/or G101 or corresponding residues from other serine
recombinases. Mutations at other residues can also promote hyperactivity; these
include (in approximate order of strength of effect) V107M, V107L, Q105L, A117V,
R121K, E124Q, E124A, A89T, F92S, M103I. Preferably the mutant enzymes have
combinations of two or more of these mutations.
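
The substitution notation used above (for example D102Y, meaning that the aspartate at
position 102 is replaced by tyrosine) can be applied mechanically to a protein sequence.
The following is a minimal illustrative sketch only. The function names are hypothetical,
and the 10-residue toy sequence is a placeholder, not the Tn3 resolvase sequence.

```python
import re

# A label such as "D102Y": wild-type residue, 1-based position, replacement residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label like 'D102Y' into ('D', 102, 'Y')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def apply_mutations(sequence, labels):
    """Return a copy of `sequence` carrying every substitution in `labels`.

    The residue found at each position is checked against the wild-type
    residue named in the label, which guards against numbering errors.
    """
    residues = list(sequence)
    for label in labels:
        wild_type, position, mutant = parse_mutation(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(f"{label}: expected {wild_type}, found {found}")
        residues[position - 1] = mutant
    return "".join(residues)


# Hypothetical toy sequence, so the positions used here run from 1 to 10.
toy_sequence = "MGDAQVLRKE"
double_mutant = apply_mutations(toy_sequence, ["G2S", "D3Y"])
```

With a real, correctly numbered sequence, the combinations named in the text (for example
G101S together with D102Y) would be passed as the list of labels in the same way.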

It has also been found that mutations of resolvase surface residues
corresponding to a '2,3' interface' enhanced the activity of the mutants; see
hereinafter. The mutations that have been tested were R2A and E56K, but mutation of
several nearby residues (Hughes et al., 1990) might be similarly effective. Thus,
preferably the mutants of the present invention also comprise at least one mutation
that affects the 2,3' interface.

Whilst the present inventors have focussed their work on Tn3 resolvase, it will
be appreciated that the scope of the present invention may easily be extended to other
serine recombinases, due to the similarity between members of the family.

The serine recombinases comprise a large family of related enzymes, which
can be identified by sequence homology using standard algorithms such as BLAST.
Several residues are completely conserved, or nearly so, throughout the family.
Structural features corresponding to particular parts of the primary sequence can be
characterized because there are high-resolution crystal structures of the complete γδ
resolvase protein, and a fragment of Hin, as well as a large body of other biochemical
data that give information on the structures. Those skilled in the art can easily identify
the residues in other serine recombinases that might correspond to the Tn3 residues
which can be mutated to cause hyperactivity. For example, residues G101 and D102
are the two Tn3 resolvase residues immediately preceding the N-terminus of a long
α-helix, the E-helix of Yang and Steitz (1995), that contributes to the dimer interface. The
equivalent residues can be identified in most other members of the serine recombinase
family. Similarly, residues corresponding to those involved in the 2,3' interaction can
be identified. See for example the review by Smith & Thorpe, 2002, or the attached
Figure 1, which shows an alignment of the N-terminal catalytic domains
from a number of serine recombinases. For example, the present inventors have
preliminary evidence that equivalent mutations of Sin recombinase from
Staphylococcus aureus, which is quite distant from Tn3 resolvase, have the predicted
effects.

The hyperactive mutants described herein can utilise the 'Site I' sequence for
recombination. 'Site I' is a 28 bp sequence from the natural res recombination
site. Desirably, smaller regions could be used which still allow recombination to occur.
This may depend on the mutant developed, but it can easily be determined
by the skilled addressee. In practice, however, the sequence will always be embedded
in a longer DNA molecule. It has been observed that many bases can be mutated
individually without serious loss of recombination activity, and even multiple changes
may not be very deleterious. However, a site comprising only the central 16 bp of site
I (that is, 6 bp at each end replaced so that no bases are conserved), or <16 bp, is not a
substrate for the hyperactive mutant resolvases described herein.

Advantages of hyperactive serine recombinases over currently available enzymes
for genetic manipulation

1. They act at short DNA sites, and do not require specific site orientation or
supercoiling. They are therefore 'better' than other serine recombinases previously
proposed for these uses. (Long sites and other requirements make it much more
difficult to set up suitable constructs etc., and affect reactivity in chromatin-associated
DNA).

2. They do not interact with tyrosine recombinases such as Cre or FLP, and act 
at different sites. So they can be used in applications where two (or more) 
independent recombination systems are required (see reviews etc.). 

3. They may have advantages in real systems, because of their different 
properties and mechanism. For example, they might be more easily expressed/more 
stable in mammalian cells, or they might give more complete recombination. 

In a further aspect the present invention provides a hybrid mutant recombinase 
comprising an N-terminal catalytic domain from a serine recombinase connected by 
way of a linker region to a heterologous C-terminal DNA binding domain wherein the 
mutant recombinase is capable of binding nucleic acid by way of said DNA binding 
domain and said mutant recombinase catalysing recombination. 

Preferably the catalytic domain is from a hyperactive mutant recombinase 
identified, for example, according to the present invention. 

It was previously known that the N-terminal domain of Tn3 resolvase (or any
other serine recombinase tested) has no catalytic activity on its own, nor does the
isolated N-terminal domain of mutants that act on site I. It was therefore surprising
that attachment of an unrelated DNA-binding domain to a mutant catalytic domain
could restore activity at a very different DNA site. Reasons why this might not have
been considered feasible are:

1. The natural DNA-binding domain might play some essential part in the
reaction mechanism, which could be performed by a related DNA-binding domain,
but not an unrelated one; e.g. involvement of conserved residues, or transient
dissociation from its binding site.

2. The natural domain might not participate in the reaction, but its size, shape,
and position might be critical. For example, a larger domain might interfere with
essential conformational changes in the DNA or protein.

3. The nature of the linker sequence between the two domains might be
critical, and it might not have been possible to reconstruct it appropriately (for
example, because the N-terminal residues of the unrelated DNA-binding domain and
the resolvase DNA-binding domain were differently positioned relative to the binding
site).

In practice the important steps in going from a natural serine recombinase (e.g.
Tn3) recombination system to a functional 'hybrid' system are as follows:

1. Identification of multiple mutants of resolvase that rapidly recombine two
28 bp site I's, thereby removing the requirement for 'accessory sites' (see Arnold et
al., 1999; and the development of hyperactive mutants described herein);

2. Deciding where to terminate the N-terminal domain, to separate DNA-
binding from essential catalytic functions;

3. Choosing an appropriate substitute DNA-binding domain (e.g. Zif268)
(from literature analysis);

4. Designing appropriate linker peptide sequences to join the DNA-binding
and catalytic domains of the hybrids; and

5. Designing potential recombination sites for the hybrid enzyme.

In the hybrid recombinases of the present invention, the catalytic domain of a 
hyperactive mutant resolvase (or other serine recombinase) is joined via a short linker 
sequence to a DNA-binding domain from a different protein. The DNA-binding 
domain can be any of a number of such domains known to those skilled in the art, 
such as the domain from other serine recombinases, or from some transposases, or 
from bacterial repressors, tyrosine recombinases, etc. Suitably the DNA-binding 
domain may be eukaryotic in origin, for example, from eukaryotic transcription 
factors, especially a zinc finger DNA-binding domain such as that from Zif268, or 
variants of one of these with altered sequence recognition. 

The hybrids that have been constructed to date by the present inventors
contain the contiguous first 146 residues of Tn3 resolvase, with appropriate
'activating' mutations (see hereinabove for information). (The proteins actually tested
have all of the following mutations: R2A E56K G101S D102Y M103I Q105L,
although this should not be construed as limiting.) The traditional 'catalytic' and
'DNA-binding' domains of resolvase and relatives were identified following
proteolysis, and are residues 1-140 and 141-183 (for γδ resolvase) respectively. The
C-terminal domain has been shown to retain DNA-binding activity, but no activities
were found for the N-terminal 'catalytic' domain on its own. Current evidence
suggests that all catalytic functions may reside in the contiguous residues 1-125. The
sequence from 126-146 may, however, contribute to binding and sequence recognition
near the centre of the site. It is envisaged that it may be possible to mutate the
126-146 region or replace it with the equivalent segment from another serine recombinase,
to alter reactivity or target specificity.
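
The modular layout described above (an N-terminal segment of the recombinase, then a
linker, then a heterologous DNA-binding domain) can be sketched as a simple
concatenation. This is an illustration under stated assumptions only: the function name
is hypothetical, and all three sequences below are placeholders, not the sequences of
the constructs actually tested.

```python
def assemble_hybrid(recombinase, linker, binding_domain, catalytic_length=146):
    """Join the first `catalytic_length` residues of a recombinase to a
    heterologous DNA-binding domain through a linker peptide.

    The default of 146 follows the text above; the text also suggests that
    residues 126-146 may be varied, which is not modelled here.
    """
    if len(recombinase) < catalytic_length:
        raise ValueError("recombinase is shorter than the requested segment")
    return recombinase[:catalytic_length] + linker + binding_domain


# Placeholder sequences (hypothetical): a 185-residue 'recombinase', a short
# flexible linker and a 90-residue 'DNA-binding domain'.
placeholder_recombinase = "M" + "A" * 184
placeholder_linker = "GSGS"
placeholder_domain = "K" * 90

hybrid = assemble_hybrid(placeholder_recombinase, placeholder_linker, placeholder_domain)
```

Keeping `catalytic_length` as a parameter makes it straightforward to compare different
junction positions, since the text leaves the best truncation point open.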

Preferably the linker region should be a sequence with structural flexibility,
but the linker may depend strongly on the DNA-binding domain employed. This can,
however, easily be determined by the skilled addressee. It may be that shorter linkers
will potentially lead to more efficient recombination, but might be more restricted in
sequence variation. Thus an appropriate linker may depend on the requirements of the
user. Some linkers may increase the efficiency of recombination at the expense of
DNA sequence specificity, whilst others may allow recombination to occur at lower
efficiency, but with a greater variation in sequence.

Resolvase binds to site I as a dimer. To act at asymmetric sequences (see
below), it will be desirable to bind a heterodimer of the hybrid recombinase, where
the DNA-binding domains of the two subunits interact with distinct sequence
elements. Likewise, it may be desirable to have a different heterodimer to recognize a
partner recombination site; i.e. up to four different hybrid recombinase proteins could
be simultaneously involved (see Bibikova et al., 2002 for an example of this type of
approach in a different system).

Although the hybrids of the present invention have been exemplified with
respect to Tn3 resolvase-derived systems, this should not be construed as limiting.
Based on the present teaching similar procedures could be used to create equivalent
hybrids from other serine recombinases. Indeed, this might lead to better
recombinases, because other recombinases have different 'site I' central sequences,
which could be better for some specific natural sequences chosen to be recombination
sites.

In order for a hybrid recombinase to function appropriately and catalyse
recombination it is necessary for the enzyme to recognise and bind to an appropriate 
stretch of DNA. Typically the DNA sequence may comprise two regions recognized 
by the DNA-binding domain(s) of the hybrid recombinase(s), flanking a central 
sequence which may make some specific interactions with the catalytic domain and/or 
the 126-146 segment, or similar region from another serine recombinase. The site will 
always be embedded in a longer DNA molecule (typically, but not necessarily, 
kilobasepairs). 

A typical site may be about 40 bp long. Experiments by the present inventors
indicate that the positioning of the sequence elements that are recognized by the DNA-binding
domains (relative to the centre of the site) is very important. The ideal positions will
certainly vary depending on the DNA-binding domain and linker sequence used. The
sequences of sites that have been tested by the present inventors are shown in attached
Figure 2a. These sites all comprise two copies of the natural 9 bp motif that is
recognized by Zif268, flanking a central sequence of varying length. All of the central
sequences used so far contain at least 11 contiguous basepairs of identity to the centre
of site I, but it is very likely that sequences with less similarity to site I will also be
active. It should be noted that non-hybrid hyperactive mutant resolvases are not active
on these sites. The main features of the recombination site are illustrated in the
attached Figure 3.
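
The site layout just described (two copies of a 9 bp motif flanking a central sequence)
can be sketched as follows. This is an illustrative sketch, not a reproduction of the
sites in Figure 2a. Assumptions: the motif is taken to be the widely cited Zif268
consensus GCGTGGGCG; the second copy is placed as an inverted repeat, so that each
subunit of a bound dimer would see the same motif relative to the centre; and the
22 bp central sequence is an arbitrary placeholder, not the centre of site I.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

ZIF268_MOTIF = "GCGTGGGCG"  # assumed consensus 9 bp Zif268 binding motif


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_site(central, motif=ZIF268_MOTIF):
    """Flank `central` with the motif on the left and its inverted copy on the right."""
    return motif + central + reverse_complement(motif)


def central_overlap(site, width=2):
    """Return the `width` basepairs at the exact centre of an even-length site."""
    start = (len(site) - width) // 2
    return site[start:start + width]


# Hypothetical 22 bp central sequence, giving a site of 9 + 22 + 9 = 40 bp.
placeholder_central = "AATTTATAATATATTATAAATT"
site = build_site(placeholder_central)
```

Varying the length of `central` moves both motifs relative to the centre, which is the
spacing parameter that the text identifies as very important.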

The two sites that are to recombine need not be identical. They could be
recognized by separate hybrid recombinase heterodimers, providing that the catalytic
domains were similar, so that the catalytically competent synapse of the two sites
could be formed. Importantly however, the 2 bp at the centre of the sites should be
identical for efficient reaction (this is because these bp form a 'heteroduplex' in the
recombinants, and the basepairs would be mismatched if the 2 bp sequences were
different). (Tyrosine recombinases require longer regions of identity at the centre of
their sites; 6 bp for Cre, and 8 bp for FLP.) Also, the relative orientation of this
'overlap' sequence defines whether excision or inversion will occur between two sites
in the same molecule.
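
The two rules in this paragraph can be sketched in a few lines. This is a simplified
illustration with hypothetical function names. Assumptions: both overlaps are read on
the same strand of one molecule; identical overlaps are treated as a direct repeat
(excision) and reverse-complementary overlaps as an inverted repeat (inversion); and,
for the probability estimate, the four bases are taken to be equally likely and
independent.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def predicted_outcome(overlap_a, overlap_b):
    """Classify two central overlaps read on the same strand of one molecule."""
    a, b = overlap_a.upper(), overlap_b.upper()
    direct = a == b
    inverted = b == reverse_complement(a)
    if direct and inverted:
        # A palindromic overlap such as AT does not define an orientation.
        return "excision or inversion"
    if direct:
        return "excision"
    if inverted:
        return "inversion"
    return "mismatched overlap"


def chance_of_identical_overlap(width):
    """Chance that two random `width` bp sequences match, for equiprobable bases."""
    return 0.25 ** width


# 2 bp gives 1/16, 6 bp gives 1/4096 and 8 bp gives 1/65536.
chances = {width: chance_of_identical_overlap(width) for width in (2, 6, 8)}
```

Under these assumptions a 2 bp requirement is met by about one random pair of sites in
16, against about one in 4096 for 6 bp and one in 65536 for 8 bp.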

Without being bound by theory it is predicted that, for any chosen
recombination site sequence, it will be necessary to carry out an optimization
procedure to achieve high activity. In general, this procedure will include the
following steps.

1. One or two candidate recombination sites will be chosen, which have a
central sequence with some similarity to site I, flanked by sequences at appropriate
distances from the centre that could be recognized by selected DNA-binding domains;

2. The DNA-binding domains will be optimized for recognition of their
targets. This can be done completely separately from the recombination system, using
methods well known to those skilled in the art; mutagenesis followed by 'phage
display' selection, swapping of parts from known variants of the DNA-binding
domain, etc. (see reviews; e.g. Pabo et al., 2001);

3. Likewise, the catalytic domains and linkers may be optimized for
interaction with and recombination at the central sequences. This may be done by
making a trial recombination site, with the chosen central sequence placed between
motifs recognized by a DNA-binding domain that is known to work well; for
example, Zif268 itself. The catalytic domain and linker will then be optimized in
essentially the same way as in (2), using mutagenesis/selection methods (e.g. as
described herein), or splicing of parts from different variants or from different
serine recombinases, etc; and

4. Complete candidate hybrid recombinases may then be assembled, and tested
on the intact chosen sites. If necessary, efficiency of recombination at the sites may be
improved by further rounds of mutagenesis and selection.

In a further aspect there is provided use of a hyperactive mutant recombinase,
or hybrid recombinase according to the present invention for carrying out site-specific
recombination.

Preferably site-specific recombination is carried out in a eukaryotic cell or on 
eukaryotic DNA. More preferably site-specific recombination is conducted in a 
mammalian cell or on mammalian DNA. 

In a further aspect there is provided use of a hyperactive mutant recombinase,
or hybrid recombinase according to the present invention for the manufacture of a
medicament for therapy or prophylaxis. Said hyperactive mutant may be used to
introduce a therapeutic gene or replace/remove a defective or deleterious gene
sequence from the genome of a particular organism, such as a mammal.

In principle, all of the recombinases described herein could be used for 
virtually any current or envisaged applications of site-specific recombinases such as 
cell therapy, tissue engineering and/or gene therapy (see for example Gorman &
Bullock, 2000 and references cited therein). The hybrid recombinases can also be
used to create new sequence specificities in experimental systems, but more 
importantly, they can be used to target recombination to natural sequences in the 
genomes of (any) important organisms. 

Advantage and utility of hybrid recombinases 

Potential applications are described, for example, in WO 01/16345. Basically, a DNA
segment containing useful (e.g. therapeutic) genes can be introduced at specific
genomic sites, or 'bad' genes can be excised from the genomes of living cells, or
control of gene function can be systematically altered by excision, integration, or
inversion of DNA segments. Two examples are: (1) it may be possible to develop a
potential therapy for HIV and other retroviral diseases, by excision of the proviral
DNA (see below); (2) it may be possible to introduce useful genes (e.g. for
antibodies) at specific sites in, for example, the casein gene loci of cows, so that they
would express large quantities of the gene product in their milk.

Several groups are attempting to adapt the tyrosine recombinases Cre and FLP 
to recognize new sites (see, for example, Buchholz and Stewart, 2001; Santoro and 
Schultz, 2002). However, the present inventors believe that mutant serine 
recombinases are likely to be much more successful for this approach, because their 
modular structure facilitates the 'hybrid' constructions described herein. Also, they
are likely to be much more suitable for recombining between two natural sites (e.g. 
for excision of natural genes), because they require only 2 bp of homology at the 
centre of the recombination sites for efficient reaction. Cre requires 6 bp and FLP 
requires 8 bp (Nash, 1996); pairs of sites with this degree of identity will be very rare. 
Clearly the recombinase genes would need to be introduced into and expressed in the 
target cells, for most applications. Thus the present invention also provides a vector 
comprising a nucleic acid encoding a hyperactive or hybrid mutant recombinase 
according to the present invention. The vector may also comprise one or more 
recombinase binding sites which are recognisable by said mutant recombinase. Said 
recognition site(s) may comprise a mutated sequence with respect to the native 
sequence recognisable by the unmutated recombinase. The present invention also 
provides a cell which has been transformed with a vector as described above. It might
be necessary to modify the recombinases for various reasons concerned with their
properties in the target cells, such as to increase stability, direct them to the nucleus,
or allow their visualization. All such modifications are well known to those skilled in
the art.

The present invention will now be further described by way of example and
reference to the Figures which show:

Figure 1 shows a sequence alignment of the N-terminal domain of a number of
serine recombinases, showing that they are quite conserved;

Figure 2a shows details of Z-box sites which have been tested by the present
inventors;

Figure 2b shows details of the flexible linkers which have been tested by the
present inventors;

Figure 3 shows a schematic representation of a generic hybrid recombination
site;

Figure 4 shows plasmids used for in vivo screening for mutants. The repA
gene product is required for initiation of replication at the pSC101 origin. See Arnold
et al. (1999) for further details. pGal(res x res) is shown. In pGal(res x I), res A has
been replaced by a fragment containing site I, and in pGal(I x I), both res sites have
been replaced by site I fragments. pStr(I x I) is similar to pGal(I x I), but the
sequences containing the galK gene are replaced by sequences conferring resistance
to tetracycline and sensitivity to streptomycin (see Materials and Methods section);

Figure 4b shows in vivo properties of resolvase mutants. The colour of 
colonies on MacConkey agar plates is shown. Red signifies no resolution activity, 
pale yellow signifies full resolution, and pink signifies slow resolution. Some mutants 
gave mixtures of colonies of different colours (shown as a sectored circle). Detection 
of weak activity of some mutants on certain substrates was variable, depending on 
factors such as colony density (red circles marked with a + sign); 

Figure 5 shows (Structures of pCTres^ pCresl, and pC2I = pMA21, pAL265, 
pAL225) 

a) Summary of in vitro properties of some multiple mutants of resolvase. - = no 
activity. Higher activity is indicated by more + signs. The in vitro activities of 
D102Y, E124Q, and D102Y E124Q mutants are described in Arnold et al. (1999); 

Figure 6 shows the location of the mutations which have been made by the 
present inventors; and 

Figure 6b shows residues 100-125 of resolvase subunit A (Yang and Steitz, 
1995), containing the N-terminal section of the E-helix and the immediately preceding 
residues, in backbone representation (green). The view is from the same 
angle as in Figure 2a. The sidechain of D102 is shown in blue, and the sidechains of 
other residues mutated in the hyperactive proteins are in red. Interactions of these 
residues are denoted by the thick lines. Residues in the same subunit are green, and 
those in subunit B are in orange. 

MATERIALS AND METHODS 
Mutagenesis 

Designed mutations were introduced by cloning appropriate double-stranded 
synthetic oligonucleotides into pAT5 (Arnold et al., 1999), or pMA5811, which was 
derived from pAT5 by deletion of an EcoRV-NruI fragment. Random mutations were 
created using synthetic oligonucleotides as described in Arnold et al. (1999), or by the 
polymerase chain reaction. Primers flanking the complete resolvase ORF of pAT5 or 
pMA5811 were used to amplify the fragment, and mutagenesis was caused by biasing 
the proportions of the dNTPs (Fromant et al., 1995), or by introduction of 8-oxodGTP 
or dPTP nucleotides (Zaccolo et al., 1996). Appropriate restriction digest fragments 
from mutagenized DNA were cloned into pAT5 or pMA5811 to create libraries of 
mutants which were screened as described below. 

Screening and selection 

Resolvase expression plasmids, in vivo expression, and the GalK-based 
screening method were as described in Arnold et al. (1999). pGal(res x res), pGal(res 
x I), and pGal(I x I) were described by Arnold et al. as pDB34, pDB37, and pDB3S 
respectively. Typically, between 1 000 and 10 000 candidate mutants were screened. 
The numbers were limited either by the diversity of the library, or by the screening 
procedure, in which 'white' colonies could not be picked reliably when there were 
more than ~1000 colonies on a single 8 cm diameter MacConkey agar plate. Some 
mutants were selected by a method in which resolution of a test plasmid causes loss of 
tetracycline resistance, but confers resistance to streptomycin. In the test plasmid 
pStr(I x I) (=pMA5531), the galK gene of pGal(I x I) was replaced by sequences 
containing a gene for tetracycline resistance, and the strA (rpsL) gene, encoding the 
wild-type ribosomal S12 protein, from pABS12. When highly expressed, S12 causes 
streptomycin-sensitivity in strains of E. coli that are normally resistant due to a 
mutation in the chromosomal copy of this gene. Agar plates containing streptomycin 
and kanamycin therefore select for loss of the plasmid-encoded strA gene by 
recombination between the two site Is. Libraries of pAT5 containing mutant resolvase 
ORFs were used to transform E. coli strain DS941/pStr(I x I). Liquid cultures (LB 
medium) were grown without selection for variable time intervals before spreading 
aliquots on L-agar plates containing kanamycin and streptomycin (200 µg/ml). 
Mutant versions of pAT5 were isolated from colonies that appeared at early time 
points. 

RESULTS 
Mutation strategies 

The point mutation D102Y allows Tn3 resolvase to recombine a res x site I 
substrate. The double mutant D102Y E124Q can slowly recombine a site I x site I 
substrate, although it is still greatly stimulated by the presence of acc (in a res x res or 
res x site I substrate) (Arnold et al., 1999). The present inventors therefore adopted 
three approaches to find other activating mutations of resolvase: (A) random 
mutagenesis of the catalytic domain of resolvase; (B) mutation of residue D102 to all 
other amino acids; (C) random mutagenesis of resolvases which already contained 
D102Y, E124Q, or both mutations. The inventors then observed the effects of 
combining activating mutations with each other and/or with mutations at the 2,3' 
interface. 

Random mutagenesis of the resolvase catalytic domain 

Libraries of resolvase expression plasmids mutagenized throughout the 
catalytic domain (residues 1-140) by PCR-based methods (see Materials and 
Methods), or between residues 94 and 121 with oligonucleotides (Arnold et al., 1999), 
were screened for resolution of pGal(res x I) in vivo, by an assay in which resolution 
of a test plasmid, with a gene for GalK flanked by two recombination sites, is detected 
by formation of white (galK-) rather than red (galK+) colonies on MacConkey 
indicator agar plates (Arnold et al., 1999; Figure 4a). Sequencing of the resolvase 
expression plasmids from white colonies identified several hyperactive mutants, all of 
which were altered at residue 102 (Table 1). The present inventors noted that the 
single mutant M103I was erroneously stated to be hyperactive in Arnold et al. (1999); 
re-sequencing of the expression plasmid revealed an additional mutation D102A. 
M103I does not show detectable hyperactivity in the MacConkey assay, nor does it 
when combined with E124Q (see above; Figure 4b). However, the D102A M103I and 
D102T M103I double mutants are more hyperactive than the corresponding D102 
single mutants (see below). 




The G101S and Q105L single mutants were later found to be hyperactive 
(Figure 4b; see below), though they were not recovered from screens of mutagenized 
wild-type resolvase, probably because their resolution of pGal(res x I) was 
insufficient to produce distinctly paler single colonies amidst many red colonies. 



Saturation mutation of D102 

Residue D102 was mutated to all 19 other amino acid residues, by cloning 
synthetic oligonucleotides into the resolvase ORF of pAT5. The mutants were assayed 
as described above; the results are summarized in Figure 4b. All 19 mutants resolved 
pGal(res x res), which has two full res sites. pGal(I x I), a plasmid with no acc, i.e. 
just two copies of site I, was not resolved detectably in this assay by any D102 
mutant. pGal(res x I) was resolved efficiently by the mutants D102Y, D102F, and 
D102I. D102W, D102V, and D102T had lower activity on pGal(res x I), as indicated 
by a pinker colour of the colonies in the assay, D102A had barely detectable activity, 
and all other D102 mutants did not have detectable activity (i.e. red colonies). The 
mutant D100Y was also tested, but was not hyperactive (see Discussion). 
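
The saturation scan described above can be tabulated in a few lines. The following Python sketch is illustrative only and is not part of the original work: it enumerates the 19 substitutions at D102 and records the pGal(res x I) phenotypes reported in the text. The names `ACTIVITY_ON_RES_X_I` and `classify` are hypothetical.

```python
# Illustrative sketch; helper names are hypothetical, phenotypes are those reported in the text.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter code

WILD_TYPE, POSITION = "D", 102

# Every substitution of D102 by one of the other 19 residues, e.g. "D102Y".
d102_mutants = [f"{WILD_TYPE}{POSITION}{aa}" for aa in AMINO_ACIDS if aa != WILD_TYPE]

# Resolution of pGal(res x I) in the MacConkey assay, as described in the text.
ACTIVITY_ON_RES_X_I = {
    "efficient": {"D102Y", "D102F", "D102I"},
    "lower": {"D102W", "D102V", "D102T"},
    "barely detectable": {"D102A"},
}


def classify(mutant: str) -> str:
    """Return the reported activity level of a D102 mutant on pGal(res x I)."""
    for level, members in ACTIVITY_ON_RES_X_I.items():
        if mutant in members:
            return level
    return "not detectable"


for mutant in d102_mutants:
    print(mutant, classify(mutant))
```

Only the pGal(res x I) column is modelled here; all 19 mutants resolved pGal(res x res) and none detectably resolved pGal(I x I), so those columns carry no further information.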

To assess further the effects of the activating D102 substitutions, some of 
them were combined with E124Q, and the properties of the double mutants were 
compared (Figure 4b). These results suggest that the most potent activating mutations 
of D102 are to F, Y, or I. 

Random mutation of D102Y, E124Q, and D102Y E124Q ORFs 

The entire D102Y resolvase ORF was subjected to random mutagenesis by 
PCR-based methods. Mutants which resolved pGal(I x I) retained D102Y, and had the 
additional mutations A117V or R121K or E124Q (Table 1; Figure 4b). The original 
A117V isolates had a third mutation, I138V. The D102Y A117V double mutant was 
active on pGal(I x I), but less so than the original triple mutant. I138V was not 
hyperactive as a single mutant, and the D102Y I138V double mutant did not resolve 
pGal(I x I) (data not shown). The D102Y E124Q double mutant had been created 
previously by design (Arnold et al., 1999). The single mutants A117V, R121K, and 
E124Q resolved pGal(res x res), but resolution of pGal(res x I) or pGal(I x I) was 
undetectable. 

E124Q resolvase was mutagenized by PCR, between residues 10 and 140. 
Libraries of mutants were screened for resolution of pGal(res x I), which is not 
resolved by E124Q itself. Second mutations which confer resolution activity were 
identified as F92S, G101S, D102V, D102Y, and Q105L (Table 1; Figure 4b). The 
G101S, D102Y, and Q105L (+E124Q) double mutants also resolved pGal(I x I). All 
of the derived single mutants except F92S had detectable activity on pGal(res x I) 
(Figure 4b). Some other D102 mutants had increased hyperactivity when combined 
with E124Q (see above). 

The entire resolvase ORF containing both D102Y and E124Q mutations was 
mutagenized by PCR. Because D102Y E124Q itself resolves pGal(I x I), an 
alternative method was used to select for E. coli containing resolvase mutants that 
were able to promote rapid resolution of a site I x site I plasmid, pStr(I x I), thereby 
conferring streptomycin resistance (see Materials and Methods for details). Three 
mutants which rapidly resolved pStr(I x I) (and pGal(I x I)) had the additional 
mutations A89T, G101S, and V107M (Table 1). The derived double mutants A89T 
D102Y, G101S D102Y, and D102Y V107M all resolved pGal(I x I) (A89T D102Y 
less efficiently, giving pink colonies). A89T and V107M single mutants did not show 
detectable hyperactivity in the MacConkey plate assay (Figure 4b). 

Combinations of activating mutations 

The results described above indicated that some combinations of activating 
mutations were more effective than single mutations. The present inventors therefore 
tested resolvases with several designed combinations of mutations, by making 
'cassettes' containing either four mutations of residues at or near D102 (G101S 
D102Y M103I Q105L; M-cassette), or three mutations nearer the C-terminus (A117V 
R121K E124Q; C-cassette). Further mutants were then created by combining the 
cassettes with D102Y, E124Q, or each other (Figure 4b). The M-cassette mutant 
promoted efficient resolution of all three pGal test plasmids. Combination of the 
M-cassette with E124Q (MQ) decreased activity on pGal(res x I) and pGal(I x I). The 
C-cassette mutant did not promote detectable resolution of any of the pGal plasmids, 
nor did the mutant containing both M- and C-cassettes (MC). Combination of the 
C-cassette with D102Y (YC) restored resolution of pGal(res x res), but the other pGal 
plasmids were not resolved. 

Effect of mutations at the 2,3' interface 

Mutations of residues at the 2,3' interface abolish recombination activity in γδ 
resolvase. Tn3 resolvase mutant in two 2,3' interface residues (R2A and E56K; 
N-cassette; see Hughes et al., 1990) was likewise completely inactive, in vivo (Figure 
4b) and in vitro (unpublished results). The triple mutants R2A E56K D102Y (NY) 
and R2A E56K E124Q (NQ) were also inactive, but in striking contrast, the quadruple 
mutant R2A E56K D102Y E124Q (NYQ) was hyperactive; more so than the double 
mutant D102Y E124Q. The M and MQ multiple mutants (see above) were also 
combined with R2A E56K (N-cassette), creating NM and NMQ multiple mutants. 
These proteins efficiently resolved all three pGal test plasmids. 
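
The shorthand labels used for the multiple mutants (MQ, YC, NYQ, NM, NMQ and so on) can be expanded mechanically into lists of point mutations. The sketch below is a minimal illustration, assuming each letter of a label names one cassette or single mutation as defined in the text; the dictionary `CASSETTES` and the function `expand` are hypothetical names introduced for this example.

```python
# Illustrative sketch; cassette contents are taken from the text, helper names are hypothetical.
CASSETTES = {
    "N": ["R2A", "E56K"],                       # 2,3' interface mutations
    "M": ["G101S", "D102Y", "M103I", "Q105L"],  # mutations at or near D102
    "C": ["A117V", "R121K", "E124Q"],           # mutations nearer the C-terminus
    "Y": ["D102Y"],
    "Q": ["E124Q"],
}


def expand(label: str) -> list[str]:
    """Expand a label such as 'NMQ' into a de-duplicated list ordered by residue number."""
    mutations = {m for letter in label for m in CASSETTES[letter]}
    # "G101S"[1:-1] == "101": strip the wild-type and mutant residue letters.
    return sorted(mutations, key=lambda m: int(m[1:-1]))


for label in ("MQ", "YC", "NYQ", "NMQ"):
    print(label, expand(label))
```

A set is used so that a mutation shared by two components of a label is listed only once.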

In vitro properties of multiple hyperactive mutants 

Several of the hyperactive resolvases were over-expressed and purified. Their 
in vitro activities are summarized in Figure 5, and broadly agree with the phenotypes 
observed in vivo (Figure 4b). The present inventors analysed the multiple mutant M 
resolvase and its 2,3'-defective derivative NM in detail. Both resolvases were active 
on a site I x site I supercoiled plasmid pTet(I x I) in vitro, but NM resolvase was 
significantly more active. About half of the substrate was recombined in 4 minutes, a 
rate similar to that of wild-type resolvase on the standard resolution substrate pTet(res 
x res) under similar conditions. Site I of res is functionally symmetric (Bednarz et al., 
1990), so as expected NM resolvase gave about equal amounts of resolution and 
inversion products from pTet(I x I). There was no evidence of topological selectivity; 
a series of knots and catenanes, consistent with random collisions of sites, was formed 
from single pTet(I x I) molecules, as well as products of recombination between sites 
on separate molecules. Unexpectedly, acc sequences inhibited recombination by M 
and NM resolvases; recombination of pTet(res x I) was slow, and recombination of 
pTet(res x res) was even slower. These mutants do not use acc to impose selectivity. 
Resolution was not preferred over inversion, and the distributions of product 
topologies from all three pTet plasmids were similar; there was no evidence of a 
preferred 2-noded catenane product from pTet(res x I) or pTet(res x res). M and NM 
resolvases also promoted rapid intra- and intermolecular recombination between site 
Is on linear DNA molecules (unpublished results). 

DISCUSSION 
The role of acc 

Wild-type resolvase binds to a substrate containing two copies of site I, but 
does not catalyse any recombination. Acc sequences correctly positioned adjacent to 
both site Is (that is, a res x res substrate) are essential for efficient catalytic activity 
(Bednarz et al., 1990). Hyperactive mutants promote recombination between two site 
Is in the absence of one or both of the acc sequences. The mutations very likely 
disrupt, or allow circumvention of, a natural regulatory mechanism which enforces 
acc-dependence, rather than conferring an intrinsically new functionality. Possible 
ways in which they might do this are considered in the next section. 

It was expected that hyperactive resolvase-mediated site I x site I 
recombination in plasmids would be topologically non-selective (see Arnold et al., 
1999), because topological selectivity of wild-type resolvase involves its interactions 
with acc (see Introduction). More surprisingly, in vitro recombination by M, NM, and 
some other hyperactive mutant resolvases was inhibited by the presence of acc 
sequences in res x res or res x site I substrates, and the mutants do not use acc to 
specify a single product topology. Without wishing to be bound by theory, the present 
inventors speculate that subunits of these mutants bound at res tend to make 
inappropriate interactions with each other or with subunits bound at a partner 
recombination site, which inhibit recombination and disable the proper function of 
acc. 

Activating mutations 

The present screens have been sufficiently thorough that the inventors are 
confident that all or nearly all the residues that can be mutated to give hyperactivity 
have been identified. They are all in the catalytic domain of resolvase, between amino 
acid residues 89 and 124 (except for the weakly enhancing mutation I138V) (see 
Figure 6). This region comprises the last two strands of the β-sheet that forms the core 
of the catalytic domain, the N-terminal part of the E-helix, and short connecting loops. 
Many of the residues in this segment of the polypeptide sequence are involved in the 
interface between the two subunits of the resolvase 1,2-dimer. 

D102 is the only residue at which a single mutation can promote complete resolution 
of pGal(res x I). The long E-helix, which is a major component of the dimer interface, 
begins at residue 103, and residues 99-102 make a loop connecting the E-helix to the 
C-terminal strand of the catalytic domain core β-sheet (Yang and Steitz, 1995). D102 
(E102 in γδ resolvase) does not contribute to any of the known resolvase interfaces 
(see Arnold et al., 1999, for more details). When D102 was mutated to all other amino 
acids, remarkably, all 19 mutants resolved pGal(res x res). Seven mutants were observed 
to be hyperactive. The sidechains of the activating substitutions (Y, I, F, V, T, W, and A, 
in approximate order of decreasing effect) are all uncharged and hydrophobic, but 
other hydrophobic residues (M, L, and P) do not activate detectably. The mutation 
G101S, of the residue preceding D102, is also activating, and the double mutant 
G101S D102Y promotes complete resolution of pGal(I x I), with no acc. 

Only one other single mutant, Q105L, was detectably hyperactive in the 
MacConkey assay. The equivalent γδ resolvase residue, K105, is within the E-helix, 
but apparently does not participate in the dimer interface. The nearby activating 
mutation V107M maps to γδ resolvase residue V107, whose sidechain is in a very 
different environment, deeply buried in the hydrophobic centre of the dimer. It 
interacts with residues in the N-terminal 'subdomain' (residues 1-100; see below) of 
its own subunit, and with the E-helix of the other subunit of the dimer. 

The sidechains of the three residues A117, R121, and E124 are on the same 
face of the E-helix and contact the partner subunit of the dimer, at or near its 
presumptive catalytic site (Yang and Steitz, 1995). This group of interactions is 
present only once in the crystal structure of the DNA-bound 1,2 dimer, being one of 
its most obvious asymmetric features. The mutations A117V, R121K, and E124Q are 
all conservative changes, whose activating effect (in Tn3 resolvase) is only 
manifested in the presence of at least one other activating mutation (D102Y in all 
cases tested). 

Three other mutations, A89T, F92S, and I138V, had an enhancing effect when 
combined with other activating mutations (Figure 4b). A89 (S89 in γδ resolvase) is on 
the surface of the resolvase dimer. It has been noted to be on the putative interface of 
dimers in the 'DNA-out' tetramer (Sarkis et al., 2001). The F92 sidechain makes 
various hydrophobic interactions, including one with the E-helix (L111) of its own 
subunit. Residue I138 (V138 in γδ resolvase) is the second residue beyond the 
C-terminal end of the E-helix; its sidechain contacts a deoxyribose of the DNA 
backbone in the minor groove. Possibly the enhancing effect of the I138V mutation is 
due to suppression of an undesirable interaction, as suggested for mutations at the 2,3' 
interface (see below). 

The identification of multiple mutants with stronger hyperactivity led the 
present inventors to construct two 'cassettes' with groups of mutations which were 
close to each other in the primary sequence. Resolvases with all four mutations 
G101S, D102Y, M103I, and Q105L promoted rapid recombination of site I x site I 
plasmids, in vivo and in vitro. In contrast, catalytic activity of resolvases with the 
three mutations A117V, R121K, and E124Q was severely reduced. 

The acc-independent activity of hyperactive mutants is stimulated by 
additional mutations at the 2,3' interface. The 2,3' interface is clearly therefore not 
required for the catalytic steps of recombination (see also Grindley, 1993; Murley and 
Grindley, 1998; Sarkis et al., 2001). A recent model for the synaptic complex (Sarkis 
et al., 2001) proposes that resolvase dimers bound at site I make 2,3' interactions with 
subunits in the rest of the synapse. Again without wishing to be bound by theory, the 
present inventors speculate that this interaction is pivotal to the mechanism of 
activation of catalysis in the natural system, but it may be superfluous and mildly 
inhibitory for mutants that do not require acc. 

Development of hybrid recombinases 

Hybrid recombinases have been developed which comprise a Tn3 resolvase 
catalytic domain linked to a zinc-binding domain, Zif268. All the recombinases 
tested comprise residues 1-144 of the resolvase mutant "RMMD+", which has the 
following changes from wild-type: R2A, E56K, G101S, D102Y, M103I, Q105L. The 
first two mutations are to the "2,3'" interface, and the other four are "activating" 
mutations. In all cases the Zif268 domain has the wild-type sequence starting from 
residue 2 as given in the crystal structure paper (N.P. Pavletich and C.O. Pabo, 
Science 252, 809-817 (1991)). Between residue 144 of resolvase and residue 2 of 
Zif268 there is a "linker" sequence. The sequences tested are shown in Figure 2b. All 
proteins show activity; the most active in E. coli was the one with the linker marked 
with a large asterisk. 

The sites used are shown in Figure 2a. The relevant ones are those 
marked Z0, Z+2, etc. They comprise two invariant 9 bp motifs recognized by Zif268 
(pale blue boxes with three small arrows inside), flanking a central invariant sequence 
made up of at least 13 bp of sequence from the centre of res site I (darker pink 
shading), and some varied "spacer" basepairs which change the distance between the 
two Zif268-binding motifs. The site marked with the large asterisk (Z+6) gave the 
highest recombination in E. coli (about 75% of substrate recombined after about 20 
generations of growth). Z+4, Z+8, Z+10, and Z+12 also showed activity. The Z0 and 
Z+2 sites were inactive. 
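
The modular layout of these sites can be sketched in code. The example below is a hedged illustration only, resting on several assumptions that are not stated in the text: Z+n is read as n extra spacer base pairs split evenly on either side of the centre; the two Zif268 motifs are placed as inverted repeats; the 9 bp motif is the commonly cited Zif268 consensus GCGTGGGCG; and the central site I sequence and the spacer bases, which are given only in Figure 2a, are left as `N` placeholders. The function name `z_site` is hypothetical.

```python
# Hedged sketch of the Z-site layout; see the assumptions listed in the lead-in.
ZIF268_MOTIF = "GCGTGGGCG"  # commonly cited 9 bp Zif268 binding site (assumed orientation)
CENTRE = "N" * 13           # placeholder for 13 bp from the centre of res site I

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (N is kept as N)."""
    return seq.translate(COMPLEMENT)[::-1]


def z_site(extra_bp: int) -> str:
    """Build a placeholder Z+n site: motif, spacer, centre, spacer, inverted motif."""
    if extra_bp < 0 or extra_bp % 2:
        raise ValueError("extra_bp must be a non-negative even number")
    spacer = "N" * (extra_bp // 2)  # placeholder spacer bases
    return ZIF268_MOTIF + spacer + CENTRE + spacer + reverse_complement(ZIF268_MOTIF)


for n in (0, 2, 4, 6, 8, 10, 12):
    site = z_site(n)
    print(f"Z+{n}: {len(site)} bp, motif separation {len(site) - 18} bp")
```

Under these assumptions the separation between the two motifs grows from 13 bp in Z0 to 19 bp in Z+6, the variant reported above to give the highest recombination in E. coli.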

All combinations of hybrid proteins and sites shown in Figures 2a and b have 
been tested in E. coli. The Z0 and Z+6 sites, and two hybrid proteins corresponding 
to the linker marked with the asterisk and the one at the top of the list, have been 
tested in vitro. In vitro, the Z+6 site recombines much better than the Z0 site, which 
is almost inactive. 

Based on the detailed knowledge of Tn3 site-specific recombination and 
mutants described above, the intention is to design novel systems for promotion of 
DNA rearrangements in, for example, higher eukaryote cells. It is first necessary to test 
and show that resolvase mutants can recombine substrates containing two copies of a 
minimal 28 bp recombination site ('site I'), in mammalian cell lines. The methods to 
be used are quite well established (Groth et al., 2000; Schwikardi and Dröge, 
2000). Experiments may initially be in two or three standard cell lines, for example 
COS-1, 3T3, or 293 cells. A mutant resolvase will be expressed in the mammalian 
cells from a suitable plasmid derived from available vectors, with a standard promoter 
such as the SV40 early or CMV immediate early viral promoters, and a transcription 
terminator/polyadenylation signal. Further experiments may involve quantitative 
estimation of the extent of recombination, using constructs in which recombination 
changes expression of a reporter gene (e.g. luciferase and/or GFP). To determine the 
cellular localization of the mutant resolvases, it is possible to create fusions with 
green fluorescent protein (GFP) by established methods, and analyse cells expressing 
the fusion proteins by microscopy. These fusion proteins will also allow easy 
determination of transfection efficiency. The present inventors have already 
demonstrated full recombination activity by resolvase-GFP fusion proteins in vitro 
and in E. coli. If it proves to be desirable, it is possible to attach a nuclear localization 
signal to the resolvase coding sequence. 

It is then possible to compare the efficiencies of existing hyperactive resolvase 
mutants, to identify the features of the recombinase that are most important for 
efficient recombination at minimal sites in the cell lines, and to create potentially 
improved versions of the system. Further optimization of recombination activity may 
be achieved by selection strategies, which will be easily adapted from established 
methods for selection of resolvase mutants in E. coli (see above). For example, it is 
possible to make a construct that contains a gene for a hyperactive resolvase, adjacent 
to a pair of recombination sites flanking a marker gene. Libraries of mutants in the 
resolvase ORF may be created, e.g. by PCR mutagenesis and in vitro 'shuffling'. 
Cassettes which recombine upon transfection into mammalian cells can be recovered 
by PCR amplification, and are likely to encode an active resolvase. The sequences 
encoding the active resolvases can be subjected to further mutagenesis if required. 
The same plasmid constructs can be used for selection experiments in E. coli. 

The natural res site I is functionally symmetrical, so either excision or 
inversion can occur in a substrate with two sites. Alteration of the 2 bp sequence at 
the centre of site I can break this symmetry, so that only one type of event (resolution 
or inversion) is allowed; this restriction might be desirable for most biotechnology 
applications, and it is straight-forward to test the properties of such sites in cell lines. 
It is predicted that other simple mutations in the sequence of site I might increase 
efficiency of recombination in the cell lines, and reduce the likelihood of reversal of 
the rearrangement by a second round of recombination. 
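
The symmetry argument can be made concrete with a short check. The sketch below assumes, for illustration, that the rest of the site is symmetric, so that the central 2 bp alone determine the polarity of the site: a centre that is its own reverse complement (such as AT) leaves the site without polarity, while any other centre gives the site a direction. The function names are hypothetical.

```python
# Minimal sketch, assuming the central 2 bp alone determine the polarity of the site.
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper- or lower-case DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def centre_is_symmetric(centre: str) -> bool:
    """True if the 2 bp centre reads the same on both strands."""
    return reverse_complement(centre) == centre.upper()


def allowed_events(centre: str, orientation: str) -> set:
    """Events permitted between two sites in 'direct' or 'inverted' repeat."""
    if orientation not in ("direct", "inverted"):
        raise ValueError("orientation must be 'direct' or 'inverted'")
    if centre_is_symmetric(centre):
        return {"resolution", "inversion"}  # site has no polarity
    return {"resolution"} if orientation == "direct" else {"inversion"}


symmetric = ["".join(p) for p in product("ACGT", repeat=2)
             if centre_is_symmetric("".join(p))]
print(symmetric)
print(allowed_events("AT", "direct"))
print(allowed_events("AC", "direct"))
```

Of the sixteen possible dinucleotides only four are self-complementary, so most changes to a symmetric centre would be expected to restrict the reaction to a single type of event.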




Successful demonstration of efficient recombination following co-transfection 
may be followed by more stringent tests requiring excision of a marker gene which is 
inserted into the chromosomal DNA at random sites. This will more accurately reflect 
the situation in applications of the system, when the sequences of interest are typically 
associated with nuclear chromatin. Random integrants could be created by 
transfection of a suitable plasmid substrate followed by selection for a gene encoded 
by the plasmid, for example neomycin resistance. An intermediate approach which 
might prove to be very useful is to construct substrate plasmids containing sequences 
from Epstein-Barr virus, which have been shown to be located in the cell nucleus, 
chromatin-associated, and replicated with the chromosomes, thus being very good 
models for chromosomal DNA. These plasmids can be scored for recombination by 
isolation of cell DNA and transformation of E. coli. 

It is also possible to optimize resolvase-Zif268 hybrid recombinases as 
exemplified for activity in mammalian cells, using methods analogous to those 
described above for the intact resolvase protein. The structure of Zif268 bound to 
DNA has been solved (Elrod-Erickson et al., 1996), and it is the focus of studies 
aiming to create engineered zinc finger proteins that can recognize any defined short 
DNA sequence. The properties of new variants of the hybrid recombinase will be 
studied first in E. coli and in vitro, to allow for more detailed analysis and 
troubleshooting; suitable candidates will then be tested in mammalian cells. Further 
improvements in activity should be achievable by the application of mutagenesis 
followed by selection methods as described above. If these prototype recombinases 
are functional, it should be feasible to create analogous systems which recombine at a 
wide variety of synthetic sites, by replacing the natural Zif268 domain with known 
mutant versions that recognize different sequences (Choo & Isalan, 2000; Wolfe et al., 
1999). This would create the potential for applications of site-specific 
recombination technology where two or more recombination events can be promoted 
independently in the same cell. 

A straightforward extension of this approach is to attempt to target 
site-specific recombination to natural sequences in genomic DNA. This would be 
achieved by replacing the Zif268 domain of the hybrid recombinase with an altered 
version engineered to recognize part of the target sequence. Serine recombinases are 
much more promising for this type of application than the tyrosine recombinases such 
as Cre; tyrosine recombinases are not obviously divisible into 'catalytic' and 'DNA 
recognition' domains, and require more homology between the recombining sites (6-8 
bp, versus only 2 bp for serine recombinases). One application would be the 
targeting of a recombinase to a sequence in the human immunodeficiency virus (HIV) 
provirus. Excisive recombination between the two LTRs of the provirus (roughly 9 
kbp apart) would eliminate it from the genome, thereby providing a potential basis for 
therapy. Additionally it may be possible to target other genomic sequences. One such 
application would be targeted integration of gene cassettes at bovine casein gene loci, 
with the aim of creating transgenic animals which can produce large quantities of 
pharmaceutically useful proteins (Wilmut et al., 1991). 

Suitable sites for targeting will have a sequence resembling as far as possible 
the central basepairs of res site I, which are contacted by the N-terminal domain of 
resolvase and thus affect the efficiency of catalysis of strand exchange, flanked by 
sequences that can be recognized by one or two engineered versions of the Zif268 
DNA-binding domain. Most potential target sequences will have insufficient dyad 
symmetry for strong binding by both subunits of a dimer of a single hybrid 
recombinase. However, evidence from current studies indicates that strong binding by 
only one subunit of the resolvase dimer can lead to efficient recombination at a 
minimal site. A more sophisticated solution of this problem, if it turns out to be 
necessary, would be to express two versions of a hybrid recombinase, which could 
form heterodimers with appropriate sequence recognition properties. The Zif268 
domain(s) of the hybrid recombinase may therefore be modified for optimal binding 
at one or both of these sequences, based on the latest published information (see Choo 
& Isalan, 2000). Sequence recognition could be improved if required, by established 
selection methods for zinc finger proteins (Isalan et al., 2001). Substrates containing 
two copies of the potential recombination site may be constructed and analysed as 
described above. In the case of targeted integration, efficiency will be improved by 
optimization of the sequence of the recombination site associated with the gene 
cassette to be integrated. It may also be possible to reduce or eliminate the possibility 
of reversal of the integration reaction, by design of the cassette-associated site, or by 
incorporating features from the φC31 integration-specific serine recombinase system, 
which is also being actively studied for potential uses in mammalian cells (Groth et al., 
2000). 